Simple, readable sub-sentences
Abstract
We present experiments using a new unsupervised approach to automatic text simplification, which builds on sampling and ranking via a loss function informed by readability research. The main idea is that a loss function can distinguish good simplification candidates among randomly sampled sub-sentences of the input sentence. Native speakers rate our approach as equally grammatical and equally appropriate for beginner readers as a supervised SMT-based baseline system, but our setup performs more radical changes that better resemble the variation observed in human-generated simplifications.
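The sample-and-rank idea in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' system: the token-dropping sampler, the `toy_loss` function, and the parameters `keep_prob` and `target_ratio` are all hypothetical stand-ins, since the abstract does not specify the sampling scheme or the readability-informed loss.

```python
import random


def sample_sub_sentences(tokens, n_samples, keep_prob=0.7, seed=0):
    """Draw random, order-preserving token subsets of the input sentence.

    Assumption: each token is kept independently with probability keep_prob.
    The abstract does not say how sub-sentences are actually sampled.
    """
    rng = random.Random(seed)
    candidates = set()
    for _ in range(n_samples):
        kept = tuple(t for t in tokens if rng.random() < keep_prob)
        if kept:  # skip the empty candidate
            candidates.add(kept)
    return sorted(candidates)  # sorted for a deterministic ranking order


def toy_loss(candidate, original_len, target_ratio=0.6):
    """Hypothetical readability-style loss (lower is better).

    Penalizes long words and deviation from a target compression ratio.
    This is a placeholder, not the loss used in the work described above.
    """
    avg_word_len = sum(len(t) for t in candidate) / len(candidate)
    length_gap = abs(len(candidate) / original_len - target_ratio)
    return avg_word_len + 10.0 * length_gap


def simplify(sentence, n_samples=200, seed=0):
    """Sample candidate sub-sentences and return the one with the lowest loss."""
    tokens = sentence.split()
    candidates = sample_sub_sentences(tokens, n_samples, seed=seed)
    if not candidates:
        return sentence
    best = min(candidates, key=lambda c: toy_loss(c, len(tokens)))
    return " ".join(best)


if __name__ == "__main__":
    print(simplify("The committee subsequently postponed the "
                   "extraordinarily complicated deliberations"))
```

A loss this crude ignores grammaticality entirely, which is where a readability-informed loss of the kind the abstract mentions would have to do the real work.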
Similar resources
Sub-Sentential Alignment Method by Analogy
This paper describes a method for searching word correspondences between pairs of translation sentences. In Example-Based Machine Translation, translation patterns can be extracted easily if word correspondences between pairs of translation sentences are defined. The popular methods for aligning a bilingual corpus at a sub-sentential level are unable to produce reliable results when the size of...
Discourse for Machine Translation
Statistical Machine Translation is a modern success: Given a source language sentence, SMT finds the most probable target language sentence, based on (1) properties of the source; (2) probabilistic source-target mappings at the level of words, phrases and/or sub-structures; and (3) properties of the target language. SMT translates individual sentences because the search space even for a single...
A Random, Semantically Appropriate Sentence Generator for Speaker Verification
We describe two systems for automatically generating English sentences, and evaluate the suitability of their output for speaker verification. The first system, SUSGen, generates grammatical but semantically anomalous sentences of controlled length, vocabulary and phonetic content. The second system, SASGen, extends SUSGen to generate a greater variety of sentences and ones that are, for the mo...
Learning to Explain Entity Relationships in Knowledge Graphs
We study the problem of explaining relationships between pairs of knowledge graph entities with human-readable descriptions. Our method extracts and enriches sentences that refer to an entity pair from a corpus and ranks the sentences according to how well they describe the relationship between the entities. We model this task as a learning to rank problem for sentences and employ a rich set of...
Sense Tagging In Action Combining Different Tests With Additive Weightings
This paper describes a working sense tagger, which attempts to automatically link each word in a text corpus to its corresponding sense in a machine-readable dictionary (MRD). It uses information automatically extracted from the MRD to find matches between the dictionary and the corpus sentences, and combines different types of information by simple additive scores with manually set weightings.
Publication date: 2013